CADISHI: Fast parallel calculation of particle-pair distance histograms on CPUs and GPUs
We report on the design, implementation, optimization, and performance of the
CADISHI software package, which calculates histograms of pair-distances of
ensembles of particles on CPUs and GPUs. These histograms represent 2-point
spatial correlation functions and are routinely calculated from simulations of
soft and condensed matter, where they are referred to as radial distribution
functions, and in the analysis of the spatial distributions of galaxies and
galaxy clusters. Although conceptually simple, the calculation of radial
distribution functions via distance binning requires the evaluation of
O(N^2) particle-pair distances, where N is the number of particles
under consideration. CADISHI provides fast parallel implementations of the
distance histogram algorithm for the CPU and the GPU, written in templated C++
and CUDA. Orthorhombic and general triclinic periodic boxes are supported, in
addition to the non-periodic case. The CPU kernels feature cache-blocking,
vectorization and thread-parallelization to obtain high performance. The GPU
kernels are tuned to exploit the memory and processor features of current GPUs,
demonstrating histogramming rates up to a factor of 40 higher than on a
high-end multi-core CPU. To enable high-throughput analyses of molecular
dynamics trajectories, the compute kernels are driven by the Python-based
CADISHI engine. It implements a producer-consumer data processing pattern and
thereby enables the complete utilization of all the CPU and GPU resources
available on a specific computer, independent of special libraries such as MPI,
covering commodity systems up to high-end HPC nodes. Data input and output are
performed efficiently via HDF5. (...) The CADISHI software is freely available
under the MIT license. Comment: 19 pages
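As an illustration of the distance-binning step described in this abstract, here is a minimal pure-Python sketch. It is not CADISHI code and does not use the CADISHI API: the function name `pair_distance_histogram` and its parameters are hypothetical, and it handles only the non-periodic case and an orthorhombic periodic box via the minimum-image convention (valid when `r_max` is at most half the shortest box edge). It makes the O(N^2) cost visible as a double loop over all N(N-1)/2 pairs.

```python
import math

def pair_distance_histogram(coords, r_max, n_bins, box=None):
    """Histogram all pair distances below r_max into n_bins equal-width bins.

    coords: list of (x, y, z) tuples.
    box:    optional (Lx, Ly, Lz) of an orthorhombic periodic box; when given,
            the minimum-image convention is applied along each axis.
    """
    counts = [0] * n_bins
    scale = n_bins / r_max              # converts a distance to a bin index
    n = len(coords)
    for i in range(n - 1):              # each unordered pair is visited once
        for j in range(i + 1, n):
            d2 = 0.0
            for k in range(3):
                d = coords[j][k] - coords[i][k]
                if box is not None:
                    d -= box[k] * round(d / box[k])   # nearest periodic image
                d2 += d * d
            r = math.sqrt(d2)
            if r < r_max:
                counts[int(r * scale)] += 1
    return counts
```

For example, `pair_distance_histogram([(0, 0, 0), (1, 0, 0), (0, 2, 0)], 3.0, 3)` bins the three pair distances 1, 2, and sqrt(5). A production kernel would replace the inner loops with blocked, vectorized, or GPU code, as the abstract describes.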
A massively parallel semi-Lagrangian solver for the six-dimensional Vlasov-Poisson equation
This paper presents an optimized and scalable semi-Lagrangian solver for the
Vlasov-Poisson system in six-dimensional phase space. Grid-based solvers of the
Vlasov equation are known to give accurate results. At the same time, these
solvers are challenged by the curse of dimensionality resulting in very high
memory requirements, and moreover, requiring highly efficient parallelization
schemes. In this paper, we consider the 6d Vlasov-Poisson problem discretized
by a split-step semi-Lagrangian scheme, using successive 1d interpolations on
1d stripes of the 6d domain. Two parallelization paradigms are compared, a
remapping scheme and a classical domain decomposition approach applied to the
full 6d problem. From numerical experiments, the latter approach is found to be
superior in the massively parallel case in various respects. We address the
challenge of artificial time step restrictions due to the decomposition of the
domain by introducing a blocked one-sided communication scheme for the purely
electrostatic case and a rotating mesh for the case with a constant magnetic
field. In addition, we propose a pipelining scheme that makes it possible to
hide the cost of halo communication between neighboring processes efficiently
behind useful computation. Parallel scalability on up to 65k processes is
demonstrated for benchmark problems on a supercomputer.
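The 1d building block of such a split-step semi-Lagrangian scheme can be sketched as follows. This is an illustrative simplification, not the solver from the paper: it assumes a constant advection velocity, a periodic uniform grid, and linear interpolation (the abstract does not state which interpolation is used), and the name `advect_1d` is hypothetical. Each grid value is updated by following the characteristic backwards by `v * dt` and interpolating the old data at its foot.

```python
import math

def advect_1d(f, v, dt, dx):
    """One semi-Lagrangian step for df/dt + v * df/dx = 0 on a periodic grid.

    f: list of grid values; v: constant velocity; dt: time step; dx: spacing.
    Uses linear interpolation at the foot of each characteristic.
    """
    n = len(f)
    shift = v * dt / dx                 # displacement in units of grid cells
    out = [0.0] * n
    for i in range(n):
        s = i - shift                   # fractional index of the departure point
        j = math.floor(s)               # grid point to the left of it
        w = s - j                       # interpolation weight in [0, 1)
        out[i] = (1.0 - w) * f[j % n] + w * f[(j + 1) % n]
    return out
```

On a single periodic grid the shift may exceed one cell, so the step itself imposes no CFL-type limit. In a domain-decomposed setting the departure point must lie within the locally available halo, which is the source of the artificial time step restriction mentioned above.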
Complexity Bounds for Block-IPs
We consider integer programs (IPs) with a certain block structure, called two-stage stochastic. A two-stage stochastic IP is an integer program of the form where the constraint matrix consists of blocks on a vertical line and blocks on the diagonal line aside. We improve the bound for the Graver complexity of two-stage stochastic IPs. Our bound of reduces the dependency from to and is asymptotically tight under the exponential time hypothesis in the case that . The improved Graver complexity bound stems from improved bounds on the intersection for a class of structurally rich integer cones. Our bound of for dimension and absolute entries bounded by is independent of the number of intersected integer cones. We investigate special properties of this class, which is complemented by the fact that these properties do not hold for general integer cones. Moreover, we give structural characterizations of this class that admit their use for two-stage stochastic IPs
Single-hole transistor in p-type GaAs/AlGaAs heterostructures
A single-hole transistor is patterned in a p-type, C-doped GaAs/AlGaAs
heterostructure by AFM oxidation lithography. Clear Coulomb blockade resonances
have been observed at T=300 mK. A charging energy of ~ 1.5 meV is extracted
from Coulomb diamond measurements, in agreement with the lithographic
dimensions of the dot. The absence of excited states in Coulomb diamond
measurements, as well as the temperature dependence of Coulomb peak heights,
indicate that the dot is in the multi-level transport regime. Fluctuations in
peak spacings larger than the estimated mean single-particle level spacing are
observed. Comment: 4 pages, 5 figures
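To put the quoted numbers in context, the short sketch below converts the charging energy into a total dot capacitance and compares it with the thermal energy at the measurement temperature. It assumes the common convention E_C = e^2/C (some authors use e^2/2C instead), and the function names are hypothetical. The abstract itself reports neither the capacitance nor this comparison.

```python
E_CHARGE = 1.602176634e-19    # elementary charge in coulombs (exact SI value)
K_B_EV = 8.617333262e-5       # Boltzmann constant in eV/K

def total_capacitance(e_c_mev):
    """Capacitance in farads from E_C = e^2 / C, with E_C given in meV."""
    # E_C in joules is e * (E_C in eV), so C = e^2 / E_C = e / (E_C in volts).
    return E_CHARGE / (e_c_mev * 1e-3)

def thermal_energy_mev(t_kelvin):
    """Thermal energy k_B * T in meV."""
    return K_B_EV * t_kelvin * 1e3

c_dot = total_capacitance(1.5)        # roughly 1.07e-16 F, i.e. about 107 aF
kt = thermal_energy_mev(0.3)          # roughly 0.026 meV at 300 mK
print(f"C = {c_dot * 1e18:.0f} aF, E_C / kT = {1.5 / kt:.0f}")
```

Under this convention the charging energy exceeds k_B T at 300 mK by more than a factor of 50, which is consistent with clearly resolved Coulomb blockade resonances.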